Abstract

This paper discusses advances, due to the work of Cai, Naik, and Sivakumar [CNS95] and Glaßer [Gla00], in the complexity class collapses that follow if NP has sparse hard sets under reductions weaker than (full) truth-table reductions.



1 Quick Hits

Most of this article will be devoted to presenting the work of Glaßer [Gla00]. However, even before presenting the background and definitions for that, let us briefly note some improvements that follow from the work of Cai, Naik, and Sivakumar due to the results discussed in the first part of this article [GH00]. (See [GH00] for definitions of the terms and classes used here: USAT_Q, FewP, Few, etc.)



Theorem 1.1 (follows from the techniques of [CNS95], as noted by [vM97, BFT97, Siv00]) If SAT disjunctively reduces to a sparse set, then (∃Q)[USAT_Q ∈ P].

A proof of Theorem 1.1 is sketched in Section 5 below. This advance of Cai, Naik, and Sivakumar immediately establishes the following corollary in light of two results discussed in the first part of this article ([GH00]; see there for a discussion of attribution of the first of these results), namely,

1. If (∃Q)[USAT_Q ∈ P] then P = Few (and thus P = UP and P = FewP).

2. [VV86] If (∃Q)[USAT_Q ∈ P] then R = NP.

Corollary 1.2 If SAT disjunctively reduces to a sparse set, then P = Few and R = NP.



Furthermore, Arvind, Köbler, and Mundhenk [AKM96] prove that if SAT disjunctively reduces to a sparse set, then PH = P^NP. However, in light of Corollary 1.2 (which gives R = NP), clearly the following can be claimed.

Theorem 1.3 If SAT disjunctively reduces to a sparse set, then PH = P^R.

2 Background and Motivation 

The study of the consequences of NP having sparse hard sets under various types of 
(polynomial-time) reductions makes one of the most interesting tales in complexity 
theory. However, we will not repeat that tale here, as many good surveys of (parts of) that story are available [Mah86], [You92], [HOW92], [CO97]. Instead, let us cut right to the chase.

In particular, Table 1 shows, for the most widely studied reductions, the strongest currently known consequences of NP having sparse hard sets with respect to that reduction (we use the definitions of [LLS75], and we will, below, define some additional reductions).

Table 1 brings immediately to mind the key issue: For those reduction types for which a P = NP conclusion is not yet known, can one achieve such a conclusion or, failing that, what is the strongest conclusion that one can achieve?

Reduction      Consequence of the existence       Reference
               of sparse sets hard for NP
---------------------------------------------------------------------------
≤^p_m          P = NP                             [Mah82]
≤^p_{btt}      P = NP                             [OW91]
≤^p_c          P = NP                             [AHH+93]
≤^p_{btt(c)}   P = NP                             [AHH+93]
≤^p_d          P^R = PH, P = Few, and R = NP      See Section 1
≤^p_{tt}       PH = ZPP^NP                        [KW99]
≤^p_T          PH = ZPP^NP                        [KW99]
≤^RS_T         PH = ZPP^NP                        [KW99], see [CHW99]
≤^O_T          PH = NP^NP                         [CHW99]
≤^SN_T         PH = ZPP^(NP^NP)                   Implicit in [KW99], see [CHW99]

Table 1: Consequences of the existence of sparse sets hard for NP. ≤^SN_T, ≤^O_T, and ≤^RS_T are respectively strong nondeterministic reductions; strong and robustly overproductive reductions; and robustly strong reductions (see [CHW99] for definitions and discussion).



Proving a P = NP (or even a collapse of the boolean hierarchy) result for ≤^p_{tt} or ≤^p_T reductions may be difficult, or at least it will require nonrelativizable techniques, due to the following results.

Theorem 2.1 [AHH+93] There is an oracle world in which NP has a ≤^p_{tt}-complete tally set yet the boolean hierarchy does not collapse.



Theorem 2.2 [Kad89] For any f(n) = o(log n) there is an oracle world in which NP has sparse Turing-complete sets yet PH ≠ P^{NP[f(n)]}.



Regarding Theorem 2.1, one should keep in mind that it is well-known that the following are all equivalent:

1. NP has tally ≤^p_{tt}-hard sets.

2. NP has tally ≤^p_T-hard sets.

3. NP has sparse ≤^p_{tt}-hard sets.

4. NP has sparse ≤^p_T-hard sets.

Nevertheless, the gap between ZPP^NP and smaller classes (P^NP, P^R, NP) seems a wide one, and suggests the importance of carefully investigating whether broad classes of formulas formerly having only a ZPP^NP consequence (via the ≤^p_{tt} line of Table 1) can be shown to have stronger consequences. Glaßer [Gla00] has achieved exactly this, and Section 4 will present the key ideas of his work.



3 Definitions 

For an arbitrary set A we denote the characteristic function of A by χ_A and the cardinality of A by ||A||. We fix the alphabet Σ = {0,1}. We denote the set of all words over Σ by Σ*, and we denote the length of a word w by |w|. We usually use language to refer to (possibly nonproper) subsets of Σ*. We call a set S ⊆ Σ* sparse if and only if there exists a polynomial p such that, for all n ≥ 0, it holds that S contains at most p(n) words whose length is no greater than n. For any sets S_1, ..., S_k ⊆ Σ* we call the Cartesian product S = S_1 × ... × S_k sparse if and only if there exists a polynomial p such that, for all n ≥ 0, it holds that S contains at most p(n) elements (w_1, ..., w_k) that satisfy max{|w_1|, ..., |w_k|} ≤ n. When dealing with machines we always talk about the deterministic version unless nondeterminism is stated explicitly. We call an algorithm a Δ^p_2 algorithm if it works in polynomial time and if it has access to a SAT oracle.

In boolean formulas, v̄ denotes the negation of the variable v. An anti-Horn formula is a boolean formula in conjunctive normal form such that each conjunct contains at most one negative literal. We will be particularly concerned with k-anti-Horn formulas; these are, by definition, anti-Horn formulas having exactly one negative literal and at most k positive literals in each conjunct. A conjunct α = (v̄_0 ∨ v_1 ∨ v_2 ∨ ... ∨ v_m) of some k-anti-Horn formula is called a k-anti-Horn clause and can be written as α = (v_0 → (v_1 ∨ v_2 ∨ ... ∨ v_m)). We will always assume that v_1, ..., v_m are pairwise distinct. We refer to v_0 as the left-hand side of α and to {v_1, ..., v_m} as the right-hand side of α (RHS(α) for short). Note that we allow empty right-hand sides, i.e., k-anti-Horn clauses of the form (v_0 →); this is equivalent to (v̄_0). We write a k-anti-Horn formula as the set of its clauses.



We use the definitions and notations of Ladner, Lynch, and Selman [LLS75] for polynomial-time reductions. However, in case of ≤^p_{btt} and ≤^p_c we use the following alternative definitions, which are equivalent to those in [LLS75]. (For notational simplicity, henceforward whenever we write reduction we will mean polynomial-time reduction.)

Definition 3.1 Let A, B ⊆ Σ* be arbitrary languages.

1. A bounded truth-table reduces to B (denoted A ≤^p_{btt} B) if and only if there exists a constant k ≥ 1 and a polynomial-time machine that, given an arbitrary word x, computes a list of words y_1, ..., y_k and a k-ary boolean formula Φ_x in conjunctive normal form such that each conjunct contains at most k literals, and x ∈ A ⟺ Φ_x(χ_B(y_1), ..., χ_B(y_k)).

2. A conjunctive truth-table reduces to B (denoted A ≤^p_c B) if and only if there exists a polynomial-time machine that, given an arbitrary word x, computes a (possibly empty, i.e., m = 0) collection Y_x = {y_1, ..., y_m} such that x ∈ A ⟺ (∀j : 1 ≤ j ≤ m)[χ_B(y_j)].

In addition, we define the following.

Definition 3.2 Let A, B ⊆ Σ* be arbitrary languages.

1. Let k ≥ 1. We say that A k-anti-Horn reduces to B (denoted A ≤^p_{k-ah} B) if and only if there exists a polynomial-time machine that, given an arbitrary word x, computes a list of words y_1, ..., y_n and an n-ary k-anti-Horn formula Φ_x such that x ∈ A ⟺ Φ_x(χ_B(y_1), ..., χ_B(y_n)).

2. A ≤^p_{btt(c)} B if and only if there exists a language X such that A ≤^p_{btt} X and X ≤^p_c B.

3. A ≤^p_{c(btt)} B if and only if there exists a language X such that A ≤^p_c X and X ≤^p_{btt} B.

4. A ≤^p_{d(btt)} B if and only if there exists a language X such that A ≤^p_d X and X ≤^p_{btt} B.

Proposition 3.3 Let A, B ⊆ Σ* be arbitrary languages. A ≤^p_{c(btt)} B if and only if there exist a constant k ≥ 1 and a polynomial-time machine that, given an arbitrary word x, computes a list of words y_1, ..., y_n and an n-ary boolean formula Φ_x in conjunctive normal form such that each conjunct contains at most k literals, and

x ∈ A ⟺ Φ_x(χ_B(y_1), ..., χ_B(y_n)).

We introduce the following abbreviated notation for the case when a set A reduces to a set B in such a way that for each word x, a list of words y_1, ..., y_n and an n-ary boolean formula Φ_x(a_1, ..., a_n) are computed such that x ∈ A ⟺ Φ_x(χ_B(y_1), ..., χ_B(y_n)). Instead of considering the list of words y_1, ..., y_n and the boolean formula Φ_x(a_1, ..., a_n) as separate objects, we combine them in a natural way into a boolean formula over words, i.e., we replace each occurrence of some variable a_i in Φ_x(a_1, ..., a_n) by the word y_i. For instance, if the reduction of some word x produces the words y_1, y_2, y_3 and the boolean formula Φ_x(a_1, a_2, a_3) = (a_1 ∨ ā_2) ∧ (a_1 ∨ ā_3), then as a simplification we assume that the reduction produces the formula Φ_x = (y_1 ∨ ȳ_2) ∧ (y_1 ∨ ȳ_3). A boolean formula over words is said to be satisfied by a set S ⊆ Σ* if and only if this formula is satisfied when each occurring word y is replaced by the value χ_S(y).
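As a small illustration of formulas over words, the following Python sketch represents a k-anti-Horn clause by its left-hand side and its set of right-hand-side words, and tests whether a set S satisfies it. The names Clause, clause_satisfied, and formula_satisfied are hypothetical and are introduced only for this example; the sketch assumes nothing beyond the definitions above, namely that a clause (v_0 → (v_1 ∨ ... ∨ v_m)) is satisfied by S exactly when v_0 ∉ S or some v_i ∈ S.

```python
from typing import FrozenSet, Iterable, NamedTuple, Set


class Clause(NamedTuple):
    """A k-anti-Horn clause (lhs -> (w_1 v ... v w_m)) over words (illustrative type)."""
    lhs: str
    rhs: FrozenSet[str]  # may be empty: (lhs ->) is just the negated word lhs


def clause_satisfied(clause: Clause, s: Set[str]) -> bool:
    # Replace each word y by chi_S(y): the implication fails only if
    # lhs is in S and no right-hand-side word is in S.
    return clause.lhs not in s or any(w in s for w in clause.rhs)


def formula_satisfied(formula: Iterable[Clause], s: Set[str]) -> bool:
    # A formula is written as the set (here: any iterable) of its clauses.
    return all(clause_satisfied(c, s) for c in formula)


S = {"0", "11"}
phi = [Clause("0", frozenset({"11", "101"})), Clause("1", frozenset())]
print(formula_satisfied(phi, S))
```

Note that a clause with an empty right-hand side, such as Clause("0", frozenset()), is satisfied by S only when its left-hand side is not in S.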



4 New Collapses to P^NP for Subclasses of Truth-Table Reductions

In this section we present the core result of [Gla00], though with what we hope is a somewhat more accessible proof. In particular, if there exists a sparse ≤^p_{k-ah}-hard set for NP, then the polynomial hierarchy collapses to P^NP. From this result, the same collapse of the polynomial hierarchy from the existence of sparse ≤^p_{c(btt)}-hard or sparse ≤^p_{d(btt)}-hard sets for NP can be shown to also hold (see the end of this section).

Throughout this section, we consider only boolean formulas (respectively, boolean clauses) over words. k will always denote the parameter of ≤^p_{k-ah} reductions, p, q, r will denote polynomials, v, w, x, y, z will denote words from Σ*, α, β, γ, δ, θ will denote k-anti-Horn clauses, Γ, Δ, Θ will denote k-anti-Horn formulas, and ℒ, ℒ_1, ℒ_2, ... will denote lists of k-anti-Horn formulas.

We introduce the following binary relation on k-anti-Horn clauses and k-anti-Horn formulas.

Definition 4.1 For k-anti-Horn clauses γ = (v_0 → (v_1 ∨ ... ∨ v_m)) and δ = (w_0 → (w_1 ∨ ... ∨ w_n)), we write γ ⊢ δ if and only if v_0 ≠ w_0 or {v_1, ..., v_m} ⊆ {w_1, ..., w_n}. For k-anti-Horn formulas Γ and Δ, we write Γ ⊢ Δ if and only if for all δ ∈ Δ there is some γ ∈ Γ with γ ⊢ δ.

Note that ⊢ is reflexive, and it is even transitive if all considered clauses have the same left-hand side. It is easy to see that γ ⊢ δ and Γ ⊢ Δ are decidable in polynomial time.
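The polynomial-time decidability of the relation of Definition 4.1 can be made concrete with a minimal Python sketch. The function names clause_entails and formula_entails, and the representation of a clause as a pair (left-hand side, frozenset of right-hand-side words), are assumptions made for this illustration only.

```python
from typing import FrozenSet, Iterable, Tuple

# Illustrative representation: (left-hand side, right-hand side).
Clause = Tuple[str, FrozenSet[str]]


def clause_entails(gamma: Clause, delta: Clause) -> bool:
    # gamma |- delta iff the left-hand sides differ
    # or RHS(gamma) is a subset of RHS(delta).
    return gamma[0] != delta[0] or gamma[1] <= delta[1]


def formula_entails(big_gamma: Iterable[Clause], big_delta: Iterable[Clause]) -> bool:
    # Gamma |- Delta iff every clause of Delta is matched by some clause of Gamma.
    gammas = list(big_gamma)
    return all(any(clause_entails(g, d) for g in gammas) for d in big_delta)


g = ("z", frozenset({"a"}))
d = ("z", frozenset({"a", "b"}))
print(clause_entails(g, d), clause_entails(d, g))
```

Both tests take time quadratic in the number of clauses times the cost of a subset test, which is one way to see the polynomial bound claimed above.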

Theorem 4.2 For all k ≥ 1, if SAT ≤^p_{k-ah}-reduces to a sparse set, then PH = P^NP.

Proof Let k ≥ 1 and let S be a sparse set such that SAT ≤^p_{k-ah} S. Let p be a polynomial such that for all n ≥ 0 it holds that p(n) > 1, and S contains at most p(n) words having length at most n. Let Φ_x denote the k-anti-Horn formula that occurs when reducing the word x to the sparse set S via the ≤^p_{k-ah} reduction mentioned above. Moreover, let q be a polynomial such that for all words x it holds that (i) Φ_x does not contain more than q(|x|) k-anti-Horn clauses and (ii) the length of words appearing in Φ_x is bounded by q(|x|).

Our aim is to show that NP^NP = P^NP, as that implies that the polynomial hierarchy collapses to P^NP. The proof has three parts, and in the first part we show the following claim.

Claim A There exists a Δ^p_2 algorithm LearnSat such that for all n ∈ ℕ and z ∈ Σ* the computation LearnSat(0^n, z) returns a k-anti-Horn formula Γ' with the following properties:

(i) each clause γ ∈ Γ' has the left-hand side z,

(ii) Γ' is satisfied by S, and

(iii) Γ' ⊢ Φ_x for all x ∈ SAT with |x| ≤ n.

So the output Γ' of the computation LearnSat(0^n, z) allows a forecast concerning queries to SAT of length at most n, in such a way that elements of SAT are treated correctly. Then, in the second part of the proof, we use Claim A to show the following.

Claim B There exists a Δ^p_2 algorithm LearnAll that, on input 0^n, returns a list of k-anti-Horn formulas, ℒ_n, such that for all words x ∈ Σ*, |x| ≤ n, it holds that
x ∈ SAT ⟺ (∀Γ ∈ ℒ_n)[Γ ⊢ Φ_x].

In other words, LearnAll(0^n) returns a list of k-anti-Horn formulas, ℒ_n, such that each x ∈ SAT with |x| ≤ n is forecast as "satisfiable" by all elements of ℒ_n, and for each x ∉ SAT with |x| ≤ n there is an element of ℒ_n giving a negative forecast. So with ℒ_n we can forecast queries to SAT of length at most n, in such a way that all queries are treated correctly. Finally, in the third part of the proof, we use the algorithm LearnAll to show that each language from NP^NP can be accepted by a Δ^p_2 algorithm. This implies NP^NP = P^NP.
PART I:

We start with the listing of the algorithm LearnSat, which works on inputs of the form (0^n, z) with n ∈ ℕ and z ∈ Σ*.

1. Algorithm: LearnSat(0^n, z)
2. Γ := {(z → z)}
3. for i := 0 to (p(q(n))^{k+1} + 1)^k
4.     if there exists x ∈ SAT with |x| ≤ n and Γ ⊬ Φ_x, then determine
       the smallest such x, call it x̂, else return Γ and stop, endif
5.     choose a k-anti-Horn clause δ ∈ Φ_x̂ such that there is no γ ∈ Γ
       with γ ⊢ δ
6.     Γ := (Γ \ {γ ∈ Γ | δ ⊢ γ}) ∪ {δ}
7.     if ||Γ|| = p(q(n))^{k+1} then
8.         choose β, γ ∈ Γ and a k-anti-Horn clause α such that β ≠ γ,
           α is satisfied by S, α ⊢ β, α ⊢ γ, and α, β, γ have the
           left-hand side z
9.         Γ := (Γ \ {γ ∈ Γ | α ⊢ γ}) ∪ {α}
10.    endif
11. next i
12. remark this step will never be reached

Claim A1 Let n ∈ ℕ and z ∈ Σ*. Then, after the initialization of the variable Γ in step 2, the following holds at the end of each step of the computation LearnSat(0^n, z).

(i) Γ is a set of k-anti-Horn clauses with 1 ≤ ||Γ|| ≤ p(q(n))^{k+1}.

(ii) All words that appear in elements of Γ are of length at most max{q(n), |z|}.

(iii) All clauses of Γ have the left-hand side z.

(iv) If γ_1, γ_2 ∈ Γ and γ_1 ⊢ γ_2 then γ_1 = γ_2.

Right after step 2 it holds that Γ is a set of k-anti-Horn clauses. This is preserved by step 6, since δ ∈ Φ_x̂ and Φ_x̂ is a k-anti-Horn formula. Also step 9 preserves this property since, by the choice of α and β in step 8, it holds that RHS(α) ⊆ RHS(β). Moreover, right after step 2 we have ||Γ|| = 1, step 6 increases ||Γ|| at most by 1, and if ||Γ|| = p(q(n))^{k+1} > 1 then step 9 decreases ||Γ|| by at least 1 (it removes at least β and γ and adds only α). This shows the first statement of the claim, and analogously we can show the second one. For the third statement we note that if all clauses of Γ have the left-hand side z, then the choice of δ in step 5 implies that it has also the left-hand side z. Finally, using statement (iii), we can show statement (iv) analogously to the first statement. This proves Claim A1.

Claim A2 If Γ, right before the execution of step 8, is satisfied by S, then the choice of α, β and γ in step 8 is possible and can be carried out in time polynomial in max{n, |z|}.

So assume that we are right before the execution of step 8, and that Γ is satisfied by S. For 0 ≤ i ≤ k let Γ_i be the set of k-anti-Horn clauses γ ∈ Γ such that there appear exactly i words on the right-hand side of γ. From Claim A1(iii) it follows that ||Γ_0|| ≤ 1. Since Σ_{i=0}^{j} h^i < h^{j+1} for all h, j ∈ ℕ with h ≥ 2, the condition ||Γ|| = p(q(n))^{k+1} in step 7 implies that there exists an m ≥ 1 such that ||Γ_m|| > p(q(n))^m. We use this fact in the following subprogram that shows a possible implementation of step 8. It assumes a read access to the program variables Γ and z of LearnSat, and it returns the required values α, β, γ.

• Γ_i := {γ ∈ Γ | ||RHS(γ)|| = i} for 0 ≤ i ≤ k

• j := 0 and let m̂ be the largest m ≥ 1 such that ||Γ_m|| > p(q(n))^m

• repeat

    • if j = 0 then α_j := (z →) else α_j := (z → (y_1 ∨ y_2 ∨ ... ∨ y_j)) endif

    • j := j + 1 and Δ_j := {γ ∈ Γ_m̂ | α_{j−1} ⊢ γ}

    • choose a word y_j ∉ RHS(α_{j−1}) that appears in a maximum number (note: set n_j to that number) of the right-hand sides of clauses in Δ_j

• until n_j < ||Δ_j||/p(q(n))

• let α := α_{j−1}, choose distinct k-anti-Horn clauses β, γ ∈ Δ_j and stop.

By Claim A1(iii), it holds that all elements of Γ have the left-hand side z. Thus Δ_{j+1} = {γ ∈ Γ_m̂ | y_1, y_2, ..., y_j appear at the right-hand side of γ}. So, as long as the algorithm does not stop, it holds that ||Δ_{j+1}|| is equal to n_j, and ||Δ_{j+1}|| ≥ ||Δ_j||/p(q(n)). If we reach the m̂-th pass, we have α_{m̂−1} = (z → (y_1 ∨ y_2 ∨ ... ∨ y_{m̂−1})), which in turn implies n_m̂ = 1 (note that the right-hand sides of clauses in Γ_m̂ consist of exactly m̂ elements, and we do not have two or more identical clauses since Γ_m̂ is a set). So we obtain ||Δ_m̂||/p(q(n)) ≥ ||Γ_m̂||/p(q(n))^m̂ > 1 = n_m̂, and it follows that the algorithm leaves the loop at the latest after the m̂-th pass. Assume that the algorithm leaves the loop right after the j'-th pass with 1 ≤ j' ≤ m̂. Then we have ||Δ_{j'}|| ≥ ||Γ_m̂||/p(q(n))^{j'−1} ≥ ||Γ_m̂||/p(q(n))^m̂ > 1 (note that Δ_1 = Γ_m̂). Thus, when we have left the loop, it holds that j = j', and there exist two distinct k-anti-Horn clauses β, γ ∈ Δ_{j'}. Since the loop's body is passed through at most m̂ ≤ k times, and each single step can be carried out in time polynomial in max{n, |z|} (note that by Claim A1 we have ||Γ|| ≤ p(q(n))^{k+1} and all words appearing in elements of Γ are of length at most max{q(n), |z|}), it follows that the above subprogram works in time polynomial in max{n, |z|}. Moreover, it returns distinct β, γ ∈ Γ_m̂ ⊆ Γ and a k-anti-Horn clause α such that α ⊢ β, α ⊢ γ, and α, β, γ have the left-hand side z.

So it remains to show that if the algorithm stops after the j'-th pass, then α_{j'−1} = (z → (y_1 ∨ y_2 ∨ ... ∨ y_{j'−1})) is satisfied by S. Suppose that the subprogram stops after the j'-th pass, and that α_{j'−1} is not satisfied by S, i.e., z ∈ S and y_1, ..., y_{j'−1} ∉ S. We know that all clauses of Γ have the left-hand side z, and by assumption, Γ is satisfied by S. It follows that each γ ∈ Δ_{j'} ⊆ Γ contains a word from S on its right-hand side (remember that these words are no longer than q(n)). There are at most p(q(n)) words in S that are no longer than q(n). By a pigeon-hole argument there exists at least one word y' ∈ S such that |y'| ≤ q(n) and y' appears in the right-hand side of at least ||Δ_{j'}||/p(q(n)) elements of Δ_{j'}. From y_1, ..., y_{j'−1} ∉ S it follows that y' ∉ RHS(α_{j'−1}) and n_{j'} ≥ ||Δ_{j'}||/p(q(n)), which is a contradiction to our assumption that the algorithm stops after the j'-th pass. This proves Claim A2.
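The subprogram for step 8 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name choose_alpha_beta_gamma is hypothetical, clauses are represented as pairs (left-hand side, frozenset of right-hand-side words), the parameter bound stands for the value p(q(n)), and the input is assumed to consist of pairwise distinct clauses with left-hand side z such that some size class m ≥ 1 contains more than bound^m clauses.

```python
from collections import Counter
from typing import FrozenSet, List, Tuple

Clause = Tuple[str, FrozenSet[str]]  # illustrative: (left-hand side, right-hand side)


def choose_alpha_beta_gamma(clauses: List[Clause], z: str, bound: int, k: int):
    """Greedy choice of alpha, beta, gamma; `bound` plays the role of p(q(n)).

    Assumes: all clauses are distinct, have left-hand side z, and for some
    m >= 1 more than bound**m of them have exactly m right-hand-side words.
    """
    by_size = {m: [c for c in clauses if len(c[1]) == m] for m in range(k + 1)}
    m_hat = max(m for m in range(1, k + 1) if len(by_size[m]) > bound ** m)
    ys: List[str] = []          # y_1, ..., y_{j-1}
    delta = by_size[m_hat]      # Delta_j, starting with Delta_1 = Gamma_{m_hat}
    while True:
        # Count how often each word not yet chosen occurs in Delta_j.
        counts = Counter(w for _, rhs in delta for w in rhs if w not in ys)
        y, n_j = counts.most_common(1)[0]
        if n_j * bound < len(delta):      # n_j < ||Delta_j|| / p(q(n)): leave the loop
            break
        ys.append(y)
        delta = [c for c in delta if y in c[1]]   # Delta_{j+1}
    alpha = (z, frozenset(ys))            # alpha_{j-1}
    return alpha, delta[0], delta[1]


S = {"z", "a"}  # at most bound = 2 words
gamma_set = [("z", frozenset(r)) for r in
             ({"a", "b"}, {"a", "c"}, {"a", "d"}, {"a", "e"}, {"z", "b"})]
alpha, beta, gamma = choose_alpha_beta_gamma(gamma_set, "z", bound=2, k=2)
print(alpha, beta, gamma)
```

In this toy run every clause is satisfied by S, the word "a" occurs in four of the five clauses, and the procedure returns the shorter clause (z → a), which is again satisfied by S, in line with the pigeon-hole argument of Claim A2.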

Using a SAT oracle in combination with binary search, step 4 of LearnSat can be carried out in time polynomial in max{n, |z|} (note that the size of Γ is polynomially bounded by Claim A1). By Claim A2, also step 8 can be carried out in time polynomial in max{n, |z|}. This shows the first part of Claim A, i.e., that LearnSat is a Δ^p_2 algorithm. The remaining part is shown in the following claim.

Claim A3 LearnSat(0^n, z) returns a k-anti-Horn formula Γ' such that

(i) each γ ∈ Γ' has the left-hand side z,

(ii) Γ' is satisfied by S, and

(iii) Γ' ⊢ Φ_x for all x ∈ SAT with |x| ≤ n.

Assume for the moment that LearnSat(0^n, z) returns some Γ', i.e., LearnSat(0^n, z) stops in step 4. From Claim A1 it follows that statement (i) holds, and that Γ' is a k-anti-Horn formula. Clearly, Γ is satisfied by S after its initialization in step 2, and step 6 preserves this property, since δ ∈ Φ_x̂ and x̂ ∈ SAT. From the choice of α in step 8 it follows that step 9 also preserves the property that Γ is satisfied by S. This shows (ii). Since the algorithm stops in step 4, we have Γ' ⊢ Φ_x for all x ∈ SAT with |x| ≤ n. This shows (iii).

So it remains to show that LearnSat(0^n, z) stops in step 4. Let us assign a weight w(θ) to each k-anti-Horn clause θ, θ = (v_0 → (v_1 ∨ v_2 ∨ ... ∨ v_j)), such that w(θ) is greater than the sum of the weights of p(q(n))^{k+1} (i.e., the number bounding ||Γ|| in LearnSat(0^n, z)) k-anti-Horn clauses that have more than j words on their right-hand side. For a k-anti-Horn clause θ and a k-anti-Horn formula Θ we define w(θ) = (p(q(n))^{k+1} + 1)^{k − ||RHS(θ)||} and w(Θ) = Σ_{θ∈Θ} w(θ).

By Claim A1(iii), in each step of the computation LearnSat(0^n, z) it holds that all clauses of Γ have the left-hand side z. From the fact that δ (in step 6) and α (in step 9) have the left-hand side z, it follows that {γ ∈ Γ | δ ⊢ γ} = {γ ∈ Γ | RHS(δ) ⊆ RHS(γ)} (in step 6) and {γ ∈ Γ | α ⊢ γ} = {γ ∈ Γ | RHS(α) ⊆ RHS(γ)} (in step 9). From the choice of δ (respectively, α) it follows that it was not in Γ right before step 6 (respectively, step 9). For δ this holds by its definition in step 5, and for α this is due to its definition in step 8 and Claim A1(iv). Thus, in step 6 (respectively, step 9) we add one new clause δ (respectively, α) to Γ, and simultaneously we delete all clauses γ ∈ Γ such that RHS(δ) ⊆ RHS(γ) (respectively, RHS(α) ⊆ RHS(γ)). From Claim A1(i) it follows that both steps increase the value of w(Γ). Hence, each pass through the loop of LearnSat(0^n, z) increases w(Γ). We reach the highest possible value w(Γ) = (p(q(n))^{k+1} + 1)^k when Γ = {(z →)}. At the latest at this point LearnSat(0^n, z) stops in step 4. Since we start with Γ = {(z → z)} and w({(z → z)}) > 0, we actually reach the end of the body of the loop at most (p(q(n))^{k+1} + 1)^k times. Thus LearnSat(0^n, z) stops in step 4. This proves Claim A3.
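The key property of the weight function, namely that one clause with j right-hand-side words outweighs any admissible collection of clauses with more than j such words, can be checked numerically. The following Python lines are a sketch only; the name weight and the sample values are assumptions for illustration, and the parameter bound stands for p(q(n))^{k+1}, the number bounding ||Γ||.

```python
def weight(rhs_size: int, bound: int, k: int) -> int:
    # w(theta) = (bound + 1)^(k - ||RHS(theta)||), with bound = p(q(n))^(k+1).
    return (bound + 1) ** (k - rhs_size)


k = 2
bound = 2 ** (k + 1)  # sample value, corresponding to p(q(n)) = 2

for j in range(k):
    # At most `bound` clauses are deleted, each with at least j + 1 words,
    # so their total weight is at most bound * weight(j + 1, ...).
    assert bound * weight(j + 1, bound, k) < weight(j, bound, k)

print(weight(0, bound, k))  # weight of the clause (z ->), the maximum of w(Gamma)
```

The inequality holds in general because bound · (bound + 1)^{k−j−1} < (bound + 1)^{k−j}.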



This shows Claim A and completes the first part of the proof of Theorem 4.2. 
PART II:

In this part we prove Claim B, i.e., we construct a Δ^p_2 algorithm LearnAll and show that on input 0^n this algorithm will compute a list ℒ_n of k-anti-Horn formulas Γ_1, Γ_2, ..., Γ_m such that the following holds for all words x of length at most n,

x ∈ SAT ⟺ (Γ_i ⊢ Φ_x for all 1 ≤ i ≤ m).

We give the listing of the algorithm LearnAll, which works on inputs of the form 0^n with n ∈ ℕ.

1. Algorithm: LearnAll(0^n)
2. for i = 1 to n
3.     ℒ := ∅
4.     while there exists an x ∉ SAT with |x| ≤ i such that Γ ⊢ Φ_x for
       all Γ ∈ ℒ
5.         let x̂ be the smallest x that satisfies this condition
6.         for each word v that appears on the left-hand side of some
           γ ∈ Φ_x̂, add Γ := LearnSat(0^i, v) to the list ℒ
7.     endwhile
8.     ℒ_i := ℒ
9. next i
10. return ℒ_n

We want to show that LearnAll can be carried out in polynomial time if we are allowed to ask queries to a SAT oracle. To do so, we first show that the number of passes through the while loop is bounded. Then we look into each single step of LearnAll and show that it can be carried out in polynomial time (with SAT as an oracle).

Claim B1 For any fixed i (of step 2), the body of the while loop (steps 5 and 6) is passed through at most p(q(i)) times.

If the condition in step 4 is satisfied, then, right before step 6 we have x̂ ∉ SAT, |x̂| ≤ i, and Γ ⊢ Φ_x̂ for all Γ ∈ ℒ. Since x̂ ∉ SAT there exists some β = (v_0 → (v_1 ∨ v_2 ∨ ... ∨ v_l)) ∈ Φ_x̂ that is not satisfied by S, i.e., v_0 ∈ S and v_1, v_2, ..., v_l ∉ S.

From Claim A it follows that (right before step 6) for each Γ ∈ ℒ it holds that all clauses of Γ have the same left-hand side and Γ is satisfied by S. Choose an arbitrary Γ ∈ ℒ, and let z be the left-hand side of the elements of Γ. We want to show that z ≠ v_0. From Γ ⊢ Φ_x̂ it follows that there exists some γ = (z → (w_1 ∨ w_2 ∨ ... ∨ w_m)) ∈ Γ such that γ ⊢ β. If z = v_0 then we have RHS(γ) ⊆ RHS(β). Thus w_1, w_2, ..., w_m ∉ S and z = v_0 ∈ S. This contradicts the fact that γ is satisfied by S. So z ≠ v_0, and it follows that, in each execution of step 6, we add to the list ℒ at least one Γ' whose elements have the left-hand side v_0 such that (i) v_0 does not appear as a left-hand side in some Γ that was on the list ℒ before, (ii) |v_0| ≤ q(i), and (iii) v_0 ∈ S. This proves Claim B1 since the number of words in S of length at most q(i) is bounded by p(q(i)).

Claim B2 Consider the computation LearnAll(0^n) for n ≥ 1, and let 1 ≤ j ≤ n. Then ℒ_j (after its definition in step 8) is a list of k-anti-Horn formulas such that for all words x of length ≤ j it holds that x ∈ SAT ⟺ (∀Γ ∈ ℒ_j)[Γ ⊢ Φ_x].

If step 8 is executed for i = j, then the condition in step 4 is false. Hence, for all x of length at most j it holds that (∀Γ ∈ ℒ_j)[Γ ⊢ Φ_x] ⟹ x ∈ SAT. On the other hand, from Claim A it follows that for all x of length at most j we have x ∈ SAT ⟹ (∀Γ ∈ ℒ_j)[Γ ⊢ Φ_x]. This proves Claim B2.

Claim B3 Using SAT as an oracle, LearnAll(0^n) can be carried out in time polynomial in n.

By Claim B1, it suffices to show that each single step of LearnAll(0^n) can be carried out in time polynomial in n. First of all we have a look at step 6. Since |v| ≤ q(|x̂|) ≤ q(i) we obtain from Claim A that LearnSat(0^i, v) can be carried out in time polynomial in i (with SAT as an oracle). It follows that the whole step 6 can be carried out in time polynomial in i ≤ n. (Note: we are at times being a bit informal regarding the uniformity that holds regarding our "[is]... polynomial in" claims, but this is a common informality and our meaning should be clear.)

Now let us see that we can test the condition in step 4 with one query to SAT. For i = 1 this is trivial (without asking any question). If i > 1 then we have already computed the list ℒ_{i−1}. By Claim B2, this list allows us to decide x ∈ SAT in polynomial time for words x of length at most i − 1 (note that by Claim A and Claim B1, the size of ℒ_{i−1} is polynomial in i). So using the fact that SAT is self-reducible allows us to decide x ∈ SAT in time polynomial in i for words x of length at most i. Let N be the polynomial-time machine that achieves this, i.e., on input (0^i, x, ℒ_{i−1}) with |x| ≤ i it decides x ∈ SAT. So the condition in step 4 is equivalent to the following one:

(∃x ∈ Σ* : |x| ≤ i)[(0^i, x, ℒ_{i−1}) ∉ L(N) ∧ (∀Γ ∈ ℒ)[Γ ⊢ Φ_x]].

Since this is an NP condition, it can be verified with one query to SAT. This shows that we can test the condition in step 4 with one query to SAT. Analogously one shows that step 5 can be carried out in time polynomial in i by asking queries to SAT (we perform a binary search here). This proves Claim B3.
This completes the second part of the proof of Theorem 4.2, since Claim B follows from the Claims B2 and B3.

PART III:

So far we have shown that if SAT reduces via a ≤^p_{k-ah} reduction to a sparse set S, then there exists a Δ^p_2 algorithm LearnAll such that, for all n, LearnAll(0^n) returns a list of k-anti-Horn formulas and this list has the nice property that with its help we can answer queries to SAT of length at most n in polynomial time. In the third part of this proof we exploit this property to show that each language from NP^NP can be accepted by a Δ^p_2 algorithm. As is standard, this implies a collapse of the polynomial hierarchy to P^NP.

Suppose we are given an arbitrary nondeterministic oracle Turing machine M^(·) and a polynomial r bounding its computation time. We define a new machine M' working on inputs of the form (x, ℒ) where x is a word and ℒ is a list of k-anti-Horn formulas (computed by LearnAll). On input (x, ℒ) the machine M' simulates the computation M^(·)(x) with the modification that queries q are replaced by tests (∀Γ ∈ ℒ)[Γ ⊢ Φ_q]. It is easy to see that L(M') ∈ NP, since the mentioned test can be carried out in polynomial time.

Now we can describe a deterministic polynomial-time Turing machine N^(·) with SAT oracle, which is such that N^(SAT) accepts the same language as does M^(SAT). On input x, the machine works as follows:

(i) Determine ℒ = LearnAll(0^{r(|x|)}).

(ii) Accept if and only if (x, ℒ) ∈ L(M').

By Claim B, (i) can be done in polynomial time with queries to SAT. Since L(M') ∈ NP, (ii) can be verified in polynomial time with one query to SAT. This shows that N^(SAT) works in deterministic polynomial time. Furthermore, by Claim B we have q ∈ SAT ⟺ (∀Γ ∈ ℒ)[Γ ⊢ Φ_q] for all words q of length at most r(|x|). Since M^(SAT)(x) can only ask queries of length at most r(|x|), the computation M^(SAT)(x) is equivalent in outcome to that of M'(x, ℒ). It follows that L(N^(SAT)) = L(M^(SAT)). This shows P^NP = NP^NP, and it follows that P^NP = PH. ■



The following theorem strengthens Theorem 4.2, i.e., it holds that P^NP = PH even if SAT reduces via a ≤^p_{c(btt)} reduction to a sparse set. Here the reduction formulas are (unbounded) conjunctions of formulas of bounded length.

Theorem 4.3 If SAT reduces via a ≤^p_{c(btt)} reduction to a sparse set, then P^NP = PH.

Proof Suppose SAT reduces via a ≤^p_{c(btt)} reduction to a sparse set S. From Proposition 3.3 it follows that there exist a constant k ≥ 1 and a polynomial-time machine that, given an arbitrary word x, computes a boolean formula of words, Φ_x, in conjunctive normal form such that each conjunct contains at most k words, and x ∈ SAT if and only if Φ_x is satisfied by S. Observe that there is a bijection f : (Σ*)^0 ∪ (Σ*)^1 ∪ ... ∪ (Σ*)^k → Σ* that is polynomial-time computable and polynomial-time invertible. For 0 ≤ i ≤ k, let S_i = f(S^i) = {f(w_1, w_2, ..., w_i) | w_1, w_2, ..., w_i ∈ S}, and note that, for each 0 ≤ i ≤ k, S^i is sparse since S is sparse. Since f is polynomial-time invertible, one can also show that S_i is sparse. Thus S' = S_0 ∪ S_1 ∪ ... ∪ S_k is sparse.

We want to show that SAT ≤^p_{k-ah} S'. To do so, we consider the conjuncts of Φ_x for some word x. For each conjunct we perform the following transformation:

(v̄_1 ∨ v̄_2 ∨ ... ∨ v̄_i ∨ w_1 ∨ w_2 ∨ ... ∨ w_j) ↦ (¬f(v_1, v_2, ..., v_i) ∨ f(w_1) ∨ f(w_2) ∨ ... ∨ f(w_j)).

It is easy to see that this transformation can be carried out in polynomial time. Moreover, it holds that a conjunct is satisfied by S if and only if the transformed conjunct is satisfied by S' (note that v_1 ∉ S ∨ ... ∨ v_i ∉ S ⟺ f(v_1, ..., v_i) ∉ S', and w ∈ S ⟺ f(w) ∈ S' for all words w, v_1, ..., v_i). This shows that SAT reduces via a ≤^p_{k-ah} reduction to a sparse set S'. From Theorem 4.2 it follows that P^NP = PH. ■
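The transformation used in this proof can be tried out on small instances. The Python sketch below is an illustration under explicit assumptions: the function encode is a hypothetical, merely injective stand-in for the bijection f (it doubles every bit and terminates each word with "01"), and the names transform, lift, conjunct_satisfied, and clause_satisfied are likewise invented for this example. A conjunct is given as a pair (negated words, positive words).

```python
from itertools import product
from typing import FrozenSet, Sequence, Set, Tuple


def encode(words: Sequence[str]) -> str:
    # Hypothetical stand-in for f: injective on tuples of words over {0,1}.
    return "".join("".join(b + b for b in w) + "01" for w in words)


def transform(neg: Sequence[str], pos: Sequence[str]) -> Tuple[str, FrozenSet[str]]:
    # (not v_1 v ... v not v_i v w_1 v ... v w_j)
    #   |->  (f(v_1,...,v_i) -> (f(w_1) v ... v f(w_j)))
    return encode(neg), frozenset(encode([w]) for w in pos)


def lift(s: Set[str], k: int) -> Set[str]:
    # S' = S_0 u S_1 u ... u S_k with S_i = f(S^i).
    return {encode(t) for i in range(k + 1) for t in product(sorted(s), repeat=i)}


def conjunct_satisfied(neg: Sequence[str], pos: Sequence[str], s: Set[str]) -> bool:
    return any(v not in s for v in neg) or any(w in s for w in pos)


def clause_satisfied(clause: Tuple[str, FrozenSet[str]], s: Set[str]) -> bool:
    lhs, rhs = clause
    return lhs not in s or any(w in s for w in rhs)


S, k = {"0", "00"}, 2
S_prime = lift(S, k)
print(transform(["0", "1"], []), clause_satisfied(transform(["0", "1"], []), S_prime))
```

For the correctness of the translated clause only injectivity of the encoding is used here; the paper's f is in addition a polynomial-time invertible bijection, which this sketch does not attempt to reproduce.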

The proofs in this section show even more: It turns out that we can replace SAT by any polynomial-time length-decreasing self-reducible set. (A language T is called polynomial-time length-decreasing self-reducible if and only if there exists a polynomial-time oracle machine M^(·) such that the language accepted by M^(T) is equal to T, and on input x this machine queries the oracle only about words w with |w| < |x|.)

Theorem 4.4 If a polynomial-time length-decreasing self-reducible set T reduces via a ≤^p_{c(btt)} reduction to a sparse set, then NP^T ⊆ P^NP.

Note that if T is a polynomial-time length-decreasing self-reducible set, then also the complement T̄ is polynomial-time length-decreasing self-reducible. Moreover, T ≤^p_{d(btt)} S implies T̄ ≤^p_{c(btt)} S for arbitrary sets S (just by negation of the reduction formulas). Thus for ≤^p_{d(btt)} reductions a result analogous to Theorem 4.4 holds.

5 On Theorem 1.1

As noted earlier, Theorem 1.1 follows from the techniques of [CNS95, Theorem 10], as has been mentioned by [vM97, BFT97, Siv00]. For completeness, we note that one can see this, for example, as follows. If SAT disjunctively reduces to some sparse set S, then also the following set L ∈ NP disjunctively reduces to S, say via some function f ∈ FP (cf. [CNS95, Theorem 10]). Let

L = {⟨ψ, 1^m, u, v⟩ | m = 2·3^l, u, v ∈ GF(2^m), (∃a = (a_0, ..., a_{n−1}))[ψ(a) ∧ Σ_{i=0}^{n−1} a_i u^i = v]},

where ψ is an n-ary boolean formula, a an assignment for ψ, and u and v are elements of the finite field that has 2^m elements (m is of the form 2·3^l for some l ≥ 0 to guarantee that this field exists). We assume that for some given word x, f(x) is a set of words (that is interpreted as a disjunction of words).

Now we follow the proof of Theorem 9 which can be found in Appendix B of [CNS95]. Let q be a polynomial such that for all n', m ≥ 0 and all boolean formulas ψ of size n' it holds that the words in f(⟨ψ, 1^m, u, v⟩) are of length at most q(n', m). Moreover, let p be a polynomial such that the number of strings in S of length at most q(n', m) is bounded by p(n', m) for all n', m ≥ 0.

Let φ be an n-ary boolean formula of size n' ≥ n that has exactly one satisfying assignment; we will determine this assignment. Choose the smallest suitable m (i.e., m = 2·3^l for some l ≥ 0) such that 2^m/p(n', m) ≥ n, call it m̂, and let F = GF(2^m̂). Note that m̂ = O(log n').

Instead of estimating probabilities as it is done in the original proof of Theorem 9 [CNS95], let us proceed as follows. For all u, v ∈ F we compute f(⟨φ, 1^m̂, u, v⟩) in polynomial time. Since φ has exactly one satisfying assignment, for each u ∈ F there is a unique v_u ∈ F such that ⟨φ, 1^m̂, u, v_u⟩ ∈ L. For each u, let S_u = ∪_{v∈F} f(⟨φ, 1^m̂, u, v⟩). Now observe the following facts.

1. For each u ∈ F there exists an s ∈ S ∩ S_u with |s| ≤ q(n', m̂).

2. The number of elements in S that are of length at most q(n', m̂) is bounded by p(n', m̂).

It follows that there is some w ∈ S that appears in at least 2^m̂/p(n', m̂) ≥ n sets S_u.

For each word w that appears in at least n sets S_{u_1}, ..., S_{u_n}, we do the following: We determine corresponding words v_1, ..., v_n, such that w ∈ f(⟨φ, 1^m̂, u_i, v_i⟩). Then we solve the following equation for a = (a_0, a_1, ..., a_{n−1}) (this is possible since we have a Vandermonde matrix).

\[
(a_0, a_1, \ldots, a_{n-1})
\begin{pmatrix}
(u_1)^0 & (u_2)^0 & \cdots & (u_n)^0 \\
(u_1)^1 & (u_2)^1 & \cdots & (u_n)^1 \\
\vdots & \vdots & & \vdots \\
(u_1)^{n-1} & (u_2)^{n-1} & \cdots & (u_n)^{n-1}
\end{pmatrix}
= (v_1, v_2, \ldots, v_n)
\tag{1}
\]

Finally we check whether a is a satisfying assignment for φ and output a in this case.

Note that if we reach some w ∈ S, then all corresponding ⟨φ, 1^m̂, u_i, v_i⟩ are elements of L. By the definition of L and the fact that φ has exactly one satisfying assignment (a_0, a_1, ..., a_{n−1}), we have Σ_{j=0}^{n−1} a_j (u_i)^j = v_i for all i. So if w ∈ S, then (1) is a valid equation. Thus, we really do find the satisfying assignment of φ. This shows that (∃Q)[USAT_Q ∈ P].
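To illustrate the last step, the following Python sketch solves a Vandermonde system of the above form over GF(2^6) and recovers a 0/1 assignment from n evaluation points. Everything in it is an assumption made for the example: the field size m = 6 (which is of the form 2·3^l with l = 1), the reduction polynomial x^6 + x^3 + 1 used to represent the field, and the hypothetical names gf_mul, gf_pow, gf_inv, poly_eval, and solve_vandermonde. It uses plain Gauss-Jordan elimination and is not the procedure of [CNS95].

```python
from typing import List

M = 6
MOD = 0b1001001  # x^6 + x^3 + 1, assumed here as the field polynomial for GF(2^6)


def gf_mul(a: int, b: int) -> int:
    # Carry-less multiplication with reduction modulo MOD.
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= MOD
    return r


def gf_pow(a: int, e: int) -> int:
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def gf_inv(a: int) -> int:
    # In a field with 2^M elements, a^(2^M - 2) is the inverse of a nonzero a.
    return gf_pow(a, (1 << M) - 2)


def poly_eval(coeffs: List[int], u: int) -> int:
    acc = 0
    for j, c in enumerate(coeffs):
        acc ^= gf_mul(c, gf_pow(u, j))  # addition in characteristic 2 is XOR
    return acc


def solve_vandermonde(us: List[int], vs: List[int]) -> List[int]:
    # Solve sum_j a_j * us[i]^j = vs[i] for pairwise distinct us.
    n = len(us)
    rows = [[gf_pow(u, j) for j in range(n)] + [v] for u, v in zip(us, vs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        inv = gf_inv(rows[col][col])
        rows[col] = [gf_mul(inv, x) for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [x ^ gf_mul(factor, y) for x, y in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


assignment = [1, 0, 1, 1]            # plays the role of (a_0, ..., a_{n-1})
us = [1, 2, 3, 4]                    # pairwise distinct field elements u_1..u_n
vs = [poly_eval(assignment, u) for u in us]
print(solve_vandermonde(us, vs))
```

Because the u_i are pairwise distinct, the Vandermonde matrix is invertible, so the recovered vector is the unique solution of the system.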